Improvements to profile PSTMM for glycan recognition profile prediction

Authors

  • Sakiko Kaiya
  • Masae Hosoda
  • Kiyoko F. Aoki-Kinoshita
Abstract

Glycans are biomolecules composed of various monosaccharides, and they are bound to proteins and lipids on the cell surface. Glycans are key players in biological phenomena such as blood type determination, cell adhesion, antigen-antibody reactions, and virus infection. More than half of the proteins in major protein databases such as SWISS-PROT are known to be glycosylated. Unlike proteins, however, glycans are tree structures, and they vary across organs, tissues, and even cells. Lectins are glycan-binding proteins that recognize monosaccharides at the non-reducing end, and it is believed that they may also recognize sugars further along the glycan chain. To capture these potential recognition structures, PSTMM (Probabilistic Sibling-dependent Tree Markov Model), which considers relationships between “sibling” monosaccharides within glycans, was developed. A profile version of PSTMM, called profile PSTMM, was subsequently developed [1] and was implemented as a web tool last year. Profile PSTMM requires a “state model”, which defines the structure of the profile to be learned from the training data. The state model of the profile PSTMM web tool is currently generated from the maximum common subtree (MCST) of all input glycan structures. However, because the MCST contains only what every input structure shares, this results in a very small state model. To compensate for this, we are developing a method that multiply aligns all glycan structures and extracts glycan substructure blocks, based on a method similar to ClustalW. We introduce this new method of multiple tree alignment here.
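The shrinking effect of the MCST-based state model can be illustrated with a minimal sketch. This is not the profile PSTMM implementation: it assumes a simplified, root-anchored common subtree over ordered trees, folded pairwise across the inputs. The tuple representation, the function names `size` and `common_subtree`, and the three toy N-glycan-like structures are all hypothetical and chosen only for illustration.

```python
from functools import reduce

# Assumed representation: a glycan is (monosaccharide_label, [child_subtrees]),
# rooted at the reducing end. Linkage information is ignored in this sketch.

def size(tree):
    """Number of nodes in a tree (0 for None)."""
    if tree is None:
        return 0
    return 1 + sum(size(child) for child in tree[1])

def common_subtree(a, b):
    """Largest common subtree of a and b that includes both roots.

    Children are treated as ordered, so matched children may not cross.
    Returns None when the root labels differ.
    """
    if a is None or b is None or a[0] != b[0]:
        return None
    ca, cb = a[1], b[1]
    n, m = len(ca), len(cb)
    # best[i][j] holds the best list of matched child subtrees for ca[i:], cb[j:]
    best = [[[] for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            options = [best[i + 1][j], best[i][j + 1]]
            matched = common_subtree(ca[i], cb[j])
            if matched is not None:
                options.append([matched] + best[i + 1][j + 1])
            best[i][j] = max(options, key=lambda kids: sum(size(k) for k in kids))
    return (a[0], best[0][0])

# Three toy structures that share a trimannosyl-like core (hypothetical data).
g1 = ("GlcNAc", [("GlcNAc", [("Man", [
    ("Man", [("GlcNAc", [("Gal", [])])]),
    ("Man", [("GlcNAc", [])]),
])])])
g2 = ("GlcNAc", [("GlcNAc", [("Man", [
    ("Man", [("GlcNAc", [])]),
    ("Man", [("Man", [])]),
])])])
g3 = ("GlcNAc", [("Fuc", []), ("GlcNAc", [("Man", [
    ("Man", []),
    ("Man", []),
])])])

glycans = [g1, g2, g3]
core = reduce(common_subtree, glycans)  # pairwise fold, a simplification

print([size(g) for g in glycans], size(core))
```

In this toy input, every branch that differs between structures is discarded, so the shared subtree is smaller than each of the inputs. An alignment-based approach, by contrast, could retain blocks that are conserved in only some of the structures.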





Publication date: 2009